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ABSTRACT 

HI intensity mapping is a new observational technique to map fluctuations in the 
large-scale structure of matter using the 21 cm emission line of atomic hydrogen (HI). 
Sensitive HI intensity mapping experiments have the potential to detect Baryon Acous¬ 
tic Oscillations (BAO) at low redshifts {z < 1) in order to constrain the properties of 
dark energy. Observations of the HI signal will be contaminated by instrumental noise 
and, more significantly, by astrophysical foregrounds, such as Galactic synchrotron 
emission, which is at least four orders of magnitude brighter than the HI signal. Fore¬ 
ground cleaning is recognised as one of the key challenges for future radio astronomy 
surveys. We study the ability of the Generalized Needlet Internal Linear Gombina- 
tion (GNILC) method to subtract radio foregrounds and to recover the cosmological HI 
signal for a general HI intensity mapping experiment. The GNILC method is a new tech¬ 
nique that uses both frequency and spatial information to separate the components 
of the observed data. Our results show that the method is robust to the complexity 
of the foregrounds. For simulated radio observations including HI emission. Galactic 
synchrotron. Galactic free-free, radio sources and 0.05 mK thermal noise, we find that 
the GNILC method can reconstruct the HI power spectrum for multipoles 30 < ^ < 150 
with 6% accuracy on 50% of the sky for a redshift z ^ 0.25. 

Key words: methods: data analysis - radio continuum: general, galaxies - radio 
lines: ISM - cosmology: observations - large-scale structure of Universe 


1 INTRODUCTION 

There are several observational methods that can be used to 
constraint the properties of the dark energy, one of the most 
powerful of them is the Baryon Acoustic Oscillations (BAO) 
(Weinberg et al. 2013). Until now, all of the BAO observa¬ 
tions were made using redshift surveys performed in the op¬ 
tical and near infrared wavebands or using the 3-dimensional 
structure in the Lya forest absorption towards a dense grid 
of high-redshift quasars (Aubourg et al. 2015). Present sam¬ 
ple sizes for the detection of BAO are typically 10® - 10®, 
but there are a number of projects planned, such as Euclid 
(Laureijs et al. 2011) and LSST (LSST Dark Energy Sci¬ 
ence Collaboration 2012), to increase this to 10^ - 10® using 
optical and near-infrared observations. It is possible, on the 
other hand, to perform redshift surveys in the radio wave¬ 
band using 21 cm radiation from neutral hydrogen (HI) to 
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select galaxies (Chang et al. 2008). The use of a different 
waveband could improve the confidence in the results from 
the optical and near-infrared surveys. 

One way to do radio redshift surveys is through a tech¬ 
nique called HI intensity mapping (Madau, Meiksin, & Rees 
1997; Battye, Davies, & Weller 2004; Peterson, Bandura, & 
Pen 2006; Loeb & Wyithe 2008). Intensity mapping is the 
study of the large-scale fluctuations in the intensity of a 
given spectral line emitted by a number of unresolved ob¬ 
jects. It has been suggested that the full intensity field could 
be used to measure the power spectrum as a function of red¬ 
shift if the continuum emission, such as the one from our 
Galaxy, can be accurately subtracted and any systematic 
instrumental noise can be calibrated to sufficient accuracy 
(Peterson et al. 2009). The main advantage of HI intensity 
mapping, compared to the optical surveys of galaxies, is that 
a large sky volume is achieved within a relatively short ob¬ 
serving time. 

Using the HI intensity mapping method, the Green 
Bank Telescope (GBT) has made the first detection of the 
HI signal at z « 0.8. To have enough significance in the 
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detection of the faint HI signal, Masui et al. (2013) cross- 
correlated the 21 cm data obtained with the GET with the 
optical data obtained by the WiggleZ Dark Energy Survey. 
This detection showed that the HI intensity mapping is a 
very promising tool to study the large scale structure of the 
Universe. 

It is worth stressing that HI intensity mapping is a 
new observational technique that is still being developed. 
Ongoing or planned observational efforts, such as CHIME 
(Bandura et al. 2014), BINGO (Battye et al. 2013), TIAN- 
LAI (Chen 2012), FAST (Smoot & Debono 2014) and SKA 
(Santos et al. 2015), are essential for making this technique a 
competitive probe of the physical properties of the Universe. 

There are two main techniques for performing an HI 
intensity mapping experiment: single-dish and interferome¬ 
ter arrays. Both kinds of experiment will have to deal with 
potential systematics, such as gain variations and correlated 
noise in frequency. Single dish experiments will require sta¬ 
ble receiver systems and good calibration. Interferometers, 
on the other hand, are known to deal more naturally with 
systematics than single-dishes (Dickinson 2012; White et al. 
1999). It is worth noting, however, that existing interfer¬ 
ometers are limited by the small number of their smallest 
baselines and therefore do not provide the required surface 
brightness sensitivity (Bull et al. 2015). 

As already mentioned, the success of any HI inten¬ 
sity mapping experiment strongly depends on our ability 
to subtract the astrophysical contaminations that will be 
present in the observed HI signal. At ~ 1 GHz, the most 
relevant foregrounds are the Galactic emission, mostly syn¬ 
chrotron radiation, and the background emission of extra- 
galactic point sources. These emissions are at least four or¬ 
ders of magnitude larger, Tj ~ 10 K, than the HI signal, 
Ti, ~ 1 mK. The high spectral resolution offered by any HI 
intensity mapping experiment allows us to use the frequency 
information of the observed data. As the foregrounds spec¬ 
tra are expected to be smooth, we can approximate them by 
a power-law in the frequency range of interest. This prop¬ 
erty can then be used to separate the HI signal from any 
other signal correlated in frequency (Ansari et al. 2012; Liu 
& Tegmark 2012). 

There are other component separation techniques avail¬ 
able in the literature for HI intensity mapping experiments, 
both in the reionisation epoch and for low redshifts. Some 
of them, such as Karhunen-Loeve Decomposition (Shaw 
et al. 2014), are parametric methods. Parametric methods 
use a model to describe some physical properties of the 
foregrounds. Others, such as Principal Gomponent Analy¬ 
sis (PGA) (Masui et al. 2013; Switzer et al. 2013; Alonso 
et al. 2015; Bigot-Sazy et al. 2015), Independent Compo¬ 
nent Analysis (IGA) (Alonso et al. 2015), FASTIGA (Wolz 
et al. 2014), inverse variance (Liu & Tegmark 2011), and 
quadratic estimators (Switzer et al. 2015), use only the ob¬ 
served data to recover the HI signal and therefore do not 
assume a specific parametric model for the foregrounds. 

In this work, we study the application of the Gener¬ 
alized Needlet Internal Linear Combination (GNILG) (Re¬ 
mazeilles et al. 2011) as a non-parametric component sep¬ 
aration technique for HI intensity mapping experiments. In 
general, the GNILG method can extract the emission of a mul¬ 
tidimensional component (spatially correlated components) 
from the observed data. Originally, Remazeilles et al. (2011) 


used the GNILG method to obtain the total emission of the 
Galactic foregrounds from CMB data. Here we apply the 
GNILG method to a single-dish HI intensity mapping in low 
redshifts. 

We organize this paper as follows. In Section 2, we 
present the formalism of the GNILG method and show how 
it can be used in an HI intensity mapping experiment. In 
Section 3, we describe the models that we use to simulate 
the different components of the observed sky. In Section 4, 
we describe and discuss the results that we have obtained. 
Finally, in Section 5, we make our final remarks about the 
present work. 


2 GNILG METHOD 

The Internal Linear Combination (ILC) method was first 
used, in the CMB context, by the WMAP team in Bennett 
et al. (2003). The ILC method applies to the observed data a 
weight matrix W that offers unit response to the CMB emis¬ 
sion while it minimizes the total variance of the foreground 
plus noise signal. 

Bennett et al. (2003) used the ILC method in pixel 
space. Later, the ILC method was used in harmonic space by 
Tegmark et al. (2003) and in wavelet space by Delabrouille 
et al. (2009). Finally, Remazeilles, Delabrouille, & Cardoso 
(2011) developed an extension of the ILC method in wavelet 
space that makes use of a prior for the power spectrum of 
a specific component of the observed sky. This extension of 
the ILC method is called Generalized Needlet Internal Lin¬ 
ear Combination (GNILG). Here, we adapt the GNILG method 
to HI intensity mapping experiments. 

We model the sky observation, Xi(p), at frequency i and 
pixel (or direction of the sky) p, as 

Xi{p) = Si{p)+ ni{p), (1) 

where Si{p) is the HI signal and ni{p) is the astrophysical 
foregrounds plus instrumental noise contribution to the ob¬ 
served data. Eq. (1) can be rewritten in the rich x 1 vector 
form, where rich is the number of frequency channels, 

x{p) ^ s{p) + n{p). (2) 

Here x are the rich observation maps, each being a mixture 
of the HI signal s and the foregrounds plus noise component 

n. 

The rich x rich covariance matrix of the sky observations, 
R(p) = {x{p)x'^{p)) , at pixel p, is 

R(p) = Rhi(p) + Rn(p), (3) 

where Rhi(j3) = {s{p)s^{p)) is the covariance matrix of the 
HI emission and Rn(p) = {'n{p)n'^(p)) is the covariance ma¬ 
trix of the foregrounds plus noise. 

The foregrounds plus noise signal may significantly vary 
with the observed directions in the sky. The relative power 
ratio between foregrounds, noise, and HI signals also changes 
with the angular scale considered. Therefore, to describe 
theoretically the foreground emissions of the sky, we need 
a large number of degrees of freedom; this number of de¬ 
grees of freedom would be infinite in the case of a inhnitely 
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narrow beam. However, in practice, to describe the observed 
data, we are limited by the number of frequency channels rich 
of our experiment. Moreover, the foreground components of 
emission are correlated over frequencies, so that in prac¬ 
tice the foregrounds plus noise signal can be represented 
as a linear combination of a finite number m of independent 
(unphysical) templates. The HI signal s is also partially cor¬ 
related over adjacent frequencies (redshift bins), therefore it 
can also be described by a finite number of degrees of free¬ 
dom, which in practice would be given by rich — rn, where m 
is the dimension of the foreground subspace. In this way, the 
HI signal can be represented as the superposition of rich — m 
independent (unphysical) templates t, 

s = St, (4) 

where S is an rich x {rich — rn) mixing matrix giving the 
contribution from the templates to the HI emission in each 
frequency channel. We note that the templates t are not 
physical templates; instead they allow us to explore the 
(u-ch — rn) X (rich — rn) submatrix of the observation covari¬ 
ance matrix that is dominated by the HI signal. Therefore, 
using Eq. (4), we see that the covariance matrix of the HI 
signal is given by an rich x rich matrix with rank equal to 
rich - m, 

Rm = SRiS"^, (5) 

where Rt = {tH) is a full-rank (rich —m) x (rich —m) matrix. 

Now, we consider the estimation of s by a linear oper¬ 
ation 

's = \Nx, (6) 

where the rich x rich ILC weight matrix W offers unit response 
to the HI emission while minimizing the total variance of the 
vector estimate 's. This means that the matrix W minimizes 
E{\\Nx\'^) under the constraint WS = S. The weight matrix 
W thus solves the following minimization problem 

min I^Tr(WRW^)] with WS = S, (7) 

where R is the covariance matrix of observations x. This 
minimization problem, which is a multidimensional ILC 
problem, can be solved through the use of a Lagrange mul¬ 
tiplier, which gives 

W = S(S^R“^S)“^S^R~\ (8) 

It is important to notice that expression (8) for W is in¬ 

variant if S is changed into ST for any invertible matrix T. 
Hence, to implement the ILC filter Eq. (8), we only need to 
know the mixing matrix S up to a multiplication by an in¬ 
vertible factor (Remazeilles et al. 2011). In other words, we 
do not need to know the exact HI mixing matrix to perform 
component separation, but only the column space of S, i.e. 
the dimension rich — m of the HI subspace. Since the mix¬ 
ing matrix of the HI emission is unknown, this is a major 
advantage of the ILC filter compared to Wiener filters. 

As we have mentioned, the mixing matrix S is not 
known beforehand. Therefore, to be able to use the ILC 


filter given by Eq. (8), we need a method to estimate it, or 
more precisely to estimate its column space. To do this, we 
will perform a constrained version of the Principal Compo¬ 
nent Analysis (PCA) to the observation covariance matrix. 
Our constrained PCA algorithm, differently from the stan¬ 
dard PCA algorithm (Jolliffe 2002), will be driven by the lo¬ 
cal signal to noise ratio (ratio between the HI power and the 
total observation power) by making use of a prior on the HI 
power spectrum. No assumption is made on the foregrounds. 
The signal to noise ratio will be computed locally both in 
pixel space and in harmonic space by performing a wavelet 
decomposition of the data. This constrained PCA step in the 
GNILC algorithm will allow us to estimate, on different di¬ 
rections in the sky and on different angular scales, the local 
number of foregrounds plus noise degrees of freedom (prin¬ 
cipal components) and the local number of HI degrees of 
freedom (complementary components). To determine the di¬ 
mension of the HI (resp. foregrounds plus noise) subspace, 
we will also use a statistical information criterion to discrim¬ 
inate between the dominant eigenvalues that are due to the 
principal components and the eigenvalues that are due to 
HI degrees of freedom. Note that the prior on the HI power 
spectrum is only used for determining the dimension of the 
HI subspace at the constrained PCA step, not at the ILC 
filtering step. 


2.1 Estimation of the observation covariance 
matrix and the HI covariance matrix 

The covariance matrix of the observations for each pair of 
frequency channels a and b can be computed by the following 
expression, 

Rab{p) = (9) 

p'GT>(p) 

where 'D{p) is a domain of pixels centred around the pixel 
p. The choice of the domain T>(p) is done in such a way that 
we are able to avoid artificial correlations between the HI 
signal and the foregrounds plus noise signal (see Appendix 
A). 

In order to estimate the dimension (the column space) 
of the HI mixing matrix S used in the multidimensional ILC 
filter, Eq. (8), we adopt a prior on the HI power spectrum. 
By describing the statistics of the HI emission with its co- 
variance matrix, we are assuming that the HI emission is de¬ 
scribed by a Gaussian field. This is a reasonable assumption 
because any departure from Gaussianity of the HI emission 
is negligible compared to the large non-Gaussianity of the 
Galactic foreground emission. 

Using a theoretical template (prior) of the HI emission 
angular power spectra for each pair of frequency channels, 
we simulate HI maps for the different frequency chan¬ 
nels. From the simulated HI maps we compute, as in Eq. 
(9), the coefficients in real space of the HI covariance ma¬ 
trix Rhi ab for each pair of frequencies. 

In this work, for the theoretical prior, we use the model 
implemented in the software GORA (Shaw et al. 2014) for 
the HI power spectrum; this model, which considers only 
the fluctuations on the HI energy density, is discussed in 
section 3.1. As we show in section 4.3, the GNILC method is 
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not critically sensitive to the exact shape of the HI power 
spectrum but depends on the relative strength of the HI 
signal compared to the observed data; the prior HI power 
spectrum only serves to determine the effective dimension 
of the HI subspace (section 2.2). 

The computation of the observation covariance matrix 
and the HI covariance matrix is made independently on 
the different wavelet scales (ranges of multipoles) considered 
(see Appendix A). The use of wavelets allows us to deter¬ 
mine the relative strength of the HI power with respect to 
the foregrounds power both on localized areas of the sky and 
on defined ranges of angular scales. 


2.2 Determination of the HI signal subspace with 
constrained PCA 

To determine the HI signal subspace on each wavelet domain 
considered (i.e. a given pixel domain and a given range of 
angular scales), we perform, using the estimate of the HI 
covariance matrix Rhi, a constrained PCA that is driven by 
the local signal to noise ratio. 

First, we apply the following transformation to the fre¬ 
quency observations. 


- 1/2 —-\/2 

Rm RRhi 


[Ujv Us] X 


Ai + 1 


Am + 1 


U), 

I |T 


( 14 ) 


where for simplicity we take 1 = 1. The eigenvalues of the co- 
-1/2 ^-1/2 

variance matrix Rhi RRhi that are approximately equal 
to 1, therefore, contain the power of the HI signal, with 
the corresponding eigenvectors spanning the HI subspace. In 
this representation, the subset of eigenvectors that form the 
rich X (rich —m) matrix Us defines the independent templates 
that contribute to the HI signal. Conversely, the number m 
of eigenvalues significantly larger than 1 corresponds to the 
dimension of the foregrounds plus noise subspace, which is 
spanned by the set of eigenvectors collected in the matrix 
Uiv. 

We can write Eq. (14) in the following compact form 


-1/2 ^-1/2 

Rhi RRhi 


UjvDivU?; + UsU?, 


(15) 


where 


a: (10) 

such that the covariance matrix of the transformed observa¬ 
tions becomes 


Djv = diag[Ai -|- 1, ■ • ■ , A™ + 1] (16) 

is an m X m matrix. Therefore, using Eq. (15) and the or¬ 
thonormality condition UjvU^ -I- UsUs = I, the foregrounds 
plus noise covariance matrix can be written as 


^-1/2-1/2 

Rhi RRhi • (H) 

Using Eq. (3), the covariance matrix of the transformed ob¬ 
servations can be decomposed as 


h;-l/2„c;-l/2 

Rm RRhi 


S-1/2 

Rhi RuRhi 


Rhi Rhi Rhi • 


( 12 ) 


= R Rhi 

^ 1 / 2 ,- 1/2 ^- 1/2 

= Rhi (Rhi RRhi 


, ^1/2 
I)Rhi 


Rhi (Uiv(Div — I)U^)Rhi . 


(17) 


Given that the eigenvalues related to foreground components 
and collected in the diagonal matrix Dw are significantly 
larger than 1, Eq. (17) can be approximated by 


Assuming that the prior HI covariance matrix Rhi is close 
to the real HI covariance matrix Rhi, Eq. (12) becomes 


- 1/2 ^- 1/2 ^- 1/2 ^- 1/2 

Rhi RRhi = Rm RnRni + I, 


(13) 


R„~RHi"(UivDjvU?;)RHf. (18) 

As the complementary part of the data, the HI covariance 
matrix can then be represented as the orthogonal subspace 
of the foregrounds plus noise subspace. 


such that the power of the transformed HI signal is given by 

the matrix I, which is close to the identity matrix I, since 

-1/2 ^-1/2 

Rhi Rhi Rhi — I- 

By diagonalizing the transformed observation covari¬ 
ance matrix (Eq. (13)), we can separate the degrees of free¬ 
dom of the foregrounds plus noise from the degrees of free¬ 
dom of the HI signal. Eq. (13) shows us that the HI signal 
has approximately unitary power. Thus, the HI subspace 
will be defined by the subset of eigenvectors corresponding 
to the eigenvalues of the transformed observation covariance 
matrix that are approximately equal to 1. Taking this in 
consideration, we obtain the following eigenstructure for the 
transformed observation covariance matrix. 


^ ^1/2, 7., ^1/2 

Rm = Rm (UsU?)Rhi . (19) 

This is the rich x rich HI covariance matrix projected onto the 
(rich — m)-dimensional HI subspace spanned by the subset 
of eigenvectors collected in the matrix Us. As long as the HI 
emission can be effectively described by rich — rn degrees of 
freedom, this projection keeps the HI emission and discards 
the foregrounds plus noise signal from the data. The effec¬ 
tive number of degrees of freedom needed to describe the 
HI emission is limited by the number of available frequency 
channels, which for HI intensity mapping experiments will 
be large enough to ensure a safe separation of the HI and 
the foregrounds plus noise subspaces. 
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Using Eq. (19) and Eq. (4), we estimate the mixing 
matrix of the HI signal as the rich x {rich — rn) matrix 

S = Rm Us. (20) 

The estimate of the HI mixing matrix, Eq. (20), is the only 
information needed to implement the multidimensional ILC 
filter given by Eq. (8). More precisely, the column space Us 
is the only information needed to use the ILC filter. The ex¬ 
act amplitude of the prior Rhi is not critical for the ILC filter 
because the prior could be multiplied by a constant factor 
while leaving Eq. (8) unchanged. This is true as long as this 
constant factor is not large enough to modify the dimen¬ 
sion of the matrix Us- In separating the HI subspace and 
the foregrounds plus noise subspace, there is a possibility 
that a subdominant part of foregrounds plus noise signal is 
projected into the HI subspace. However, the multidimen¬ 
sional ILC minimizes the variance of the foregrounds plus 
noise signal that may be present in the HI subspace while 
guaranteeing unit response to the HI signal. 

The number rich — m of degrees of freedom of the HI 
emission is expected to vary as we calculate the transformed 
observation covariance matrix, Eq. (11), in different areas of 
the sky. For example, at low Galactic latitude this number 
may decrease compared to high Galactic latitude because 
the contribution from Galactic foregrounds to the observed 
signal becomes larger at this latitude. This number is also 
expected to vary with angular scale because, unlike fore¬ 
ground emission, the HI signal and the noise have more 
power on small angular scales. Therefore, to reconstruct ac¬ 
curately the HI emission, we should estimate the dimension 
of the HI emission subspace locally in space and in angu¬ 
lar scale. This is achieved by decomposing the data on a 
wavelet frame. The use of wavelets (needlets) is described in 
the Appendix A. 

2.3 Akaike Information Criterion (AIC) 

To find the local number of degrees of freedom rich — m of the 
HI emission in each wavelet domain, we make use of a statis¬ 
tical information criterion to discriminate between the eigen¬ 
values related to foregrounds and noise and the eigenvalues 
related to the HI emission, which in our representation (Eq. 
(14)) are close to 1. Estimating the dimension rich — rn of the 
HI emission subspace is equivalent to counting the effective 
number m of eigenvalues significantly larger than 1, which 
correspond to the foregrounds plus noise degrees of freedom. 
Instead of determining the number of principal components 
in an ad-hoc manner as in the standard PCA, the effective 
rank m of the foregrounds plus noise covariance matrix is 
estimated with the use of the Akaike Information Criterion 
(AIC) (Akaike 1974). 

For each location on the sky and each range of angular 
scales (wavelet domain), we can select the best rank value 
riij, for the foregrounds plus noise covariance matrix by min¬ 
imizing the AIC, 

AlC(m) = 2 nm — 2 log(/lmax(iTi)), (21) 

where n is the number of modes in the domain considered, 
m is the number of degrees of freedom of the foregrounds 
plus noise signal, and nm is the number of parameters in the 


model. /Imax(ni) is the maximum likelihood solution of the 
data covariance matrix given a model of m independent fore¬ 
ground components. For a given number m of independent 
foreground components, the maximum likelihood solution is 
given by Eq. (15). 

We note that, as the preferred value m is the one 
with the minimum AIC value, AIC rewards goodness of ht 
through the maximum likelihood function, but it also in¬ 
cludes a penalty that is an increasing function of the number 
of dimension of the foregrounds plus noise subspace. This 
penalty prevent us of overfitting the foreground plus noise 
subspace. 

In Appendix B, we show that the application of Eq. (21) 
to our problem of Ending the dimension of the foreground 
plus noise subspace in each region considered reduces to the 
following expression, 

"ch \ 

2m-I- ^ [/q - log/li - 1] I , (22) 

i=m-\-l / 

where are the eigenvalues of the transformed covariance 
matrix of the observed data Eq. (11). 


2.4 Summary of the GNILC algorithm 

The GNILC method presented above can be divided into two 
main steps. First, using a prior on the HI power spectrum, 
we determine the local signal to noise ratio and perform a 
constrained PCA of the observed data (section 2.2) to deter¬ 
mine the effective dimension of the HI subspace. Second, we 
perform a multimensional ILC filter within the HI subspace 
(Eq. (8)) and reconstruct the HI signal. In the constrained 
PCA step, the number of principal components of the obser¬ 
vation covariance matrix is estimated locally both in space 
and in angular scale by using a wavelet (needlet) decomposi¬ 
tion of the observations. We also use a statistical information 
criterion (AIC) to make the selection of the principal compo¬ 
nents of the observation covariance matrix (the eigenvectors 
of the foregrounds plus noise subspace). 

The steps of the GNILC method can be summarized as 
follows: 

1) To isolate the different ranges of angular scales (wavelet 
scales), we first dehne a set of needlet windows in harmonic 
space (see Appendix A). These needlet windows work as 
band-pass filters. The spherical harmonic coefficients a^m of 
the observed frequency maps are then band-pass filtered by 
the needlet windows. Therefore, when we transform back 
these coefficients into a map, we include statistical informa¬ 
tion only from a certain range of angular scales. We produce 
one observed map for each needlet scale j. 

2) For each needlet scale j, we compute the data covari¬ 
ance matrix, at pixel p, of a pair of frequencies a and b as 

Ra6(p) = ^<^{p)^b{p'), (23) 

p'eT>(p) 

where is a domain of pixels centred around the pixel p. 

3) For each needlet scale j, we also compute the HI emis¬ 
sion covariance matrix by using HI maps y simulated from 
a theoretical prior on the HI angular power spectrum. 
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RHIa6(p)= ya{p)yl{p')- (24) 

p'eT>(p) 

4) We diagonalize the transformed data covariance matrix, 

-l/2^^-l/2 

Rhi RRhi as 

Rm^"R(m)RH?^" = UnDnUI + UgU^, (25) 

where Div collects the m largest eigenvalues, U]v the corre¬ 
sponding eigenvectors, and Us the (n^h — rn) eigenvectors 
related to the HI emission subspace. 

The effective dimension m of the foregrounds plus noise 
subspace (number of principal components) is estimated by 
minimizing the AIC, 


( "ch \ 

2m + ^ [pi — log/Ti — 1] with m£[l,nch\, 

i = m+l / 

where pi are the eigenvalues of Rhi RRhi 

5) For each needlet scale j, we apply an (rich — m)- 
dimensional ILC filter to the observed data, 

=S(S^R“^S)"^S^R"'^a:('’\ (26) 

where the estimated mixing matrix is given by 

S = Rm Us. (27) 

6) Finally, we synthesize the reconstructed needlet HI 
mapsas follows: the maps are transformed to spherical 
harmonic space, their harmonic coefficients are again band¬ 
pass filtered by the respective needlet window and the fil¬ 
tered harmonic coefficients are transformed back to maps in 
real space. This operation gives one reconstructed HI maps 
per needlet scale. These maps are then added to give, for 
each frequency channel, the complete reconstructed HI map. 


3 SIMULATIONS 

At radio wavelengths (A > 21 cm), the astrophysical fore¬ 
grounds are several orders of magnitude brighter than the HI 
signal. For example, at i/ = 1 GHz, T^ky ~ 10 K, while the 
HI brightness temperature is Thi ~ 0.1 mK (Battye et al. 
2013). Here, the sky temperature Tsky is given by the sum 
of the CMB temperature, the diffuse Galactic radiation, the 
emission from extragalactic sources, and the HI emission. 
The GMB emission is not considered in our simulations be¬ 
cause it is well constrained by the available data (Planck 
Collaboration et al. 2015) and consequently it can be easily 
removed from the observed data. 

To test our component separation method we perform 
three different simulations for the observed sky. The set of 
simulations is summarized in Table 1. 

For each of the simulations, we consider the same band¬ 
width: 960 to 1260 MHz, which is the proposed range for the 
BINGO experiment (Battye et al. 2013). This bandwidth 
corresponds to a redshift range of 0.13 to 0.48. To study 
the effect of the number of frequency channels in the per¬ 
formance of GNILC method, we choose a small number of 



Figure 1. The observed sky at 1117.5 MHz of simulation 1 (HI 
signal, synchrotron with constant spectral index, and thermal 
noise) with our Galactic mask. 


Table 1. Simulations for the observed sky of an HI intensity 
mapping experiment. 


Simulation 

Components 

1 

HI, synchrotron radiation with con¬ 
stant spectral index, and thermal noise 

2 

HI, synchrotron radiation with con¬ 
stant spectral index, point sources, 
free-free radiation, and thermal noise 

3 

HI, synchrotron radiation with spa¬ 
tially variable spectral index, point 
sources, free-free radiation, and ther¬ 
mal noise 


channels and vary it between 6, 10, and 20. The frequency 
channels are chosen to be equally spaced in the given band¬ 
width. 

We consider circular Gaussian beams for the frequency 
channels of the experiment. For simplicity, independently of 
the number of frequency channels, we fix the full width at 
half minimum ^fwhm of the beam for the first frequency 
channel to be equal to 40 arcmin, which is the angular reso¬ 
lution of the BINGO telescope at 1 GHz (Battye et al. 2013). 
Considering that our HI intensity mapping is diffraction- 
limited, the full width at half minimum of the beam for the 
other frequency channels can be calculated through the fol¬ 
lowing relation. 


^fwhm(i^) 


^fwhm(i^o)-^, 


(28) 


where uq corresponds to the first frequency channel. Before 
applying the GNILC method to our simulations, we recon¬ 
volve all the maps to the same angular resolution, which we 
choose to be 40 arcmin. 

We use a Galactic mask to cover the area of the sky 
where the synchrotron emission of our Galaxy is brightest. 
Our Galactic mask is given by the product of two masks: a 
±20° latitude mask and a mask designed to cover the Galac¬ 
tic emission with brightness temperatures larger than the 
threshold of 30 K at 408 MHz. This combined mask is delib- 
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Figure 2. Maps of four simulated signals covering a 50° X 50° patch of the sky: (a) Galactic synchrotron with constant spectral index, 
(b) extragalactic point sources, (c) Galactic free-free, (d) and thermal noise. The maps are centred at Galactic coordinates (120°, —60°) 
and observed at 1117.5 MHz (redshift 2 : ~ 0.3). Note the very different linear scales used for each map. 


erately chosen to allow some moderately bright synchrotron 
emission to be present in the observed sky; this gives us 
a stringer test of the GNILC method. We also use a cosine 
apodization of width 3° in our Galactic mask. This mask 
gives 51% of sky coverage. Real experiments typically have 
smaller areas but this is only an example to study how the 
GNILC method performs with different observational skies 
and a signihcantly varying morphology of the foreground 
emission over the sky. The observed sky of simulation 1 at 
1117.5 MHz covered by our Galactic mask is shown in Fig. 1. 

To simulate our maps, we use the HEALPix package 
(Gorski et al. 2005) with a resolution N^ide = 128 and a 
maximum multipole £max = 2nside — 1 = 255. 

Figure 2 shows some maps of the components of the 
simulated sky at 1117.5 MHz: synchrotron, free-free, point 
sources, and the thermal noise. The temperature scale is in 
mK. The synchrotron radiation temperature is much higher 
than the temperature of the other foregrounds, roughly one 
order of magnitude larger than the point sources emission, 
which is the second dominant foreground. The main contri¬ 
bution from the point sources is in the form of an unresolved 
background. 


3.1 HI emission 

To simulate the HI emission, we use the software CORA 
(Shaw et al. 2014) and assume the cosmological model given 
in Planck Collaboration et al. (2014). We can compute the 
emitted brightness temperature of the 21 cm line in a veloc¬ 
ity width using (Furlanetto, Oh, & Briggs 2006) 


r„du= (29) 

Kb 16 vs 

where ks is Boltzmann constant. To is the HI brightness 
temperature in the emitter location, Uo — 1420.406 MHz is 
the rest-frame emission frequency for neutral hydrogen, nui 
is the number density of neutral hydrogen atoms, dZ is the 
line-of-sight distance, and A 21 is the spontaneous emission 
coefficient of the 21 cm transition. 

The mean observed brightness temperature due to the 
average HI density in the Universe is 


f{z) 


Tojz) 
(1 + ^) 


/ 3A2ihE \ f pHi(z) \ d[ 

) AH ^ ’ 


where mn is the mass of the hydrogen atom and Phi ( 2 ) 
is the average density of HI at redshift 2 . The hrst term 
is a combination of fundamental constants and measured 
experimental parameters, the second is a function of redshift 
and the last term relates the line-of-sight distance to the 
recession velocity. 

In the background level, the last term in Eq. (30) is 
given by 


dZ _ {1 + zf 
d7 “ H{z) 


In this way, Eq. (30) becomes 


(31) 


/ 3A2iZic^ \ (1 + z)^pm{z) 

\16fSkBmH) H{z) 


(32) 


In a linearly perturbed Universe, we can construct the 
2D angular power spectrum of the HI intensity over some 
frequency range. Assuming that Al/Av and 1 -I- 2 take the 
values in an unperturbed Universe, which means that we are 
ignoring the effects of peculiar velocities and the Sachs-Wolfe 
effect, we can obtain the 3D quantity ST{r{z)n, z) from Eq. 
(32) by replacing pHi with 5pHi. The projection on the sky 
of the temperature perturbation, 5T{n), is defined, using a 
window function W{z), which can be taken as uniform in 
the observed redshift range, in the following way (Battye 
et al. 2013) 


ST{n) — / AzW{z) 5T{r{z)n, z) 


J 

= J dzW{z)T{z)Sm{r{z)n,z), 


where fei = Spm/pm- 

Making a Fourier transform, we have 


( 33 ) 
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5f{r{z)n,z)=f{z) J 

^4TTf{z)J2'i‘ f ■A;^Sm(k,z)ji{kr{z)) 

X Y^*^(k)Yirn{n), (34) 

where ji{x) is the spherical Bessel function and Yem{x) are 
the spherical harmonics. Using this expression in Eq. (33), 
we have 


^r(n) =47r^/ f dzW{z)T{z) f ^Suiik,z) 

X je(kr{z))Ye*^(k)Yem(n), (35) 

which gives the harmonic coefficients 

atm = 4:7v£ J dzW{z)f{z) J d^^5Hiik,z)jeikr{z))Y£^(k). 

(36) 

The angular power spectrum Ci is defined by the ensemble 
average Ci = {a*^^airn)- Using the orthonormality condition 
of the spherical harmonics and the definition of the power 
spectrum, 


Table 2. Instrumental parameters for a single-dish simulation. 


Parameters 

Redshift range [zmin, Zmax] 

[0.13, 0.48] 

Bandwidth i^max] (MHz) 

[960, 1260] 

Number of feed horns rif 

80 

Sky coverage Hsur (deg^) 

21000 

Observation time fobs (yi’s) 

1 

System temperature Tgys (K) 

50 

Beamwidth at the first channel (arcmin) 

40 


frequency channels. The optimal sensitivity per pixel of a 
single-dish experiment can be defined as follows (Wilson 
et al. 2009) 


Tsya 


(40) 


where Au is the frequency channel width, Tsya is the sys¬ 
tem temperature, and tpix is the integration time per pixel 
defined by 


(SHiik,z)5mik',z’)) = (27r)^^(fc - k')b^PRk)D{z)D{z'), 

(37) 

where b is the bias between the spatial distribution of the HI 
and the dark matter. Pc is the underlying dark matter power 
spectrum, and D{z) is the growth factor for dark matter 
perturbations defined such that D{0) = 1, we obtain the 
angular power spectrum for the HI signal, 

Ct=‘^ j AzW{z)f{z)D{z) J dz'Wiz')T{z')D{z') 

X J DdkPc{k)ji{kr{z))ji{kr{z')). (38) 

Using small redshift bins, Azi and Azj , centered in the red- 
shifts Zi and Zj, we can write this angular power spectrum 
for the pair of frequencies vt and Vj corresponding to the 
redshifts Zi and Zj as 


tpix = riftobs^^, (41) 

i Lsut 

where rif denotes the number of feed horns, tabs is the total 
integration time, Haur is the survey area, and Hpix = ^fwhm 
is the beam area. The values that we use for these param¬ 
eters are those of the BINGO experiment, except the sky 
coverage, which is larger and thus gives a larger noise am¬ 
plitude per pixel. The main parameters of our experiment 
can be seen in Table 2. 

For the case with 20 frequency channels, the amplitude 
of noise is equal to 0.05 mK. Later, we also consider the 
case of this amplitude to be equal to 0.08 mK. Because the 
total bandwidth of the experiment is fixed, when we decrease 
the number of frequency channels, we appropriately decrease 
the amplitude of the noise by using the relation between the 
amplitude of noise and the frequency channel width given 
by Eq. (40). 


Ctij = —f dzWiz)T{z)D{z) f dz'W{z')T{z')D{z') 

J ^Zi J 

X J k^dkPc{k)je{kr{z))ji{kr{z’)). (39) 

Using this angular power spectrum, the software CORA pro¬ 
duces maps of the HI signal with r.m.s fluctuations around 
0.1 mK. 

3.2 Thermal noise 

For the instrument, we consider only thermal noise, which 
we assume to respect a uniform Gaussian distribution over 
the sky. This means that we are not considering 1/f noise 
and other sources of systematic errors. For simplicity, we 
also take the amplitude of noise to be constant across the 


3.3 Synchrotron radiation 

Synchrotron emission arises from energetic charged particles 
moving in a magnetic held (Rybicki & Lightman 2004). In 
our Galaxy, these magnetic helds extend well outside the 
Galactic plane. For this reason, synchrotron emission is less 
concentrated in the Galactic plane than free-free radiation. 

The frequency scaling of synchrotron hux emission is in 
the form of a power law, P QC v°‘. In terms of the Rayleigh- 
Jeans brightness temperature, we have T oi , with jd = 
(a —2). Typically, at GHz frequencies, j3 = —2.8±0.2, and is 
variable across the sky (Platania et al. 1998; Reich & Reich 
1988; Davies et al. 1996). 

For the synchrotron radiation we use the reprocessed 
Haslam map at 408 MHz (Remazeilles et al. 2015), which 
includes small-scale huctuations. We rescale this map to the 
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appropriate frequencies using two models. The first one is 
a simple power-law that relates the brightness temperature 
of this radiation to the observed frequency: T oc . We 
choose P = —2.8 because it is the average value at GHz 
frequencies as given by Platania et al. (1998). The other 
model considers that the synchrotron spectral index is spa¬ 
tially variable. We use the model given by Miville-Deschenes 
et al. (2008), which used WMAP intensity and polarization 
data to do a separation of the Galactic components. In this 
model the synchrotron spectral index has a mean value of 
P = —3.0 and a standard deviation of ap = 0.06. 


3.4 Free-free radiation 

Free-free emission arises from the interaction of free elec¬ 
trons with ions in ionized media, and is called ’’free-free” 
because of the unbound state of the incoming and outgoing 
electron (Rybicki & Lightman 2004). At radio frequencies, 
this comes from warm ionized gas with typical temperature 
of Te ~ 10"^ K. 

Theoretical calculations of free-free emission in an elec¬ 
trically medium consisting of ions and electrons give the 
brightness temperature at frequency v as (Dickinson et al. 
2003) 



Figure 3. Power spectrum, £(£-|-l)C^/27r, on a logarithmic 
scale, of the synchrotron with constant spectral index (black), 
point sources (pink), free-free (red), HI (blue), and thermal noise 
(green) at 1117.5 MHz. These power spectra are calculated on 
51% of the sky. 


Tff 90mK 


(TA 

-0.35 ^ ^ 0 

Y1 

^ EM \ 

U) 

IghzJ ' 

ycm~®pc J 


(42) 


where EM is the emission measure representing the inte¬ 
gral of the electron density squared along the line-of-sight, 
EM = J Uedl, and P is the spectral index, which is around 
- 2 . 1 . 

The free-free spectrum is well-defined by a power-law 
with a temperature spectral index P = —2.1, which acts 
to flatten the spectral index of the total continuum of our 
Galaxy where it has a brightness temperature comparable to 
that of the synchrotron emission. At intermediate and high 
Galactic latitudes (|&| > 10°), the EM can be estimated us¬ 
ing Ha measurements, which can then be converted to radio 
brightness temperature assuming that Te is known and the 
dust absorption is small. To estimate the EM and obtain the 
free-free map of our Galaxy, we use the Ha map of Dickinson 
et al. (2003) with Te = 7000 K. 


3.5 Point sources 

We assume that the distribution of radio point sources is 
not spatially correlated and hence that they are Poisson dis¬ 
tributed (Liu et al. 2009). Extragalactic point sources can 
be divided into two populations: bright and isolated point 
sources, which can be detected by the instrument and re¬ 
moved directly using the observed data, and a continuum of 
unresolved sources. 

We use the model of Battye et al. (2013) to simulate 
point sources. The brightness of each source is drawn from 
the differential source counts dN/dS, where N is the num¬ 
ber of sources per steradian and S is the flux. In Battye 
et al. (2013), the authors use data from multiple continuum 
surveys at 1.4 GHz (see the references in the cited paper) 
and fit a 5th order polynomial to these data. 


logio 


l'S‘^-^dN/dS\ 
\ No ) 


±a. 

i=0 




(43) 


where oq = 2.593, oi = 9.333 x 10 a 2 = —4.839 x 10 
03 = 2.488 xl0"\ 04 = 8.995 X 10“^ and 05 = 8.506 x 10“^ 
We also have A^o = 1 Jy^^^sr“^ and 5o = 1 Jy. 

The spectral distribution of the sources is given by a 
power-law function. 


SM.S(I.4GH.)(j^)", (44) 

where the spectral index a is randomly chosen from a Gaus¬ 
sian distribution. 


P{a) 


1 


exp 


(a-ao)2 

2cr2 


(45) 


with a mean equals to ao = —2.5 and a variance equals to 
(j = 0.5 (Tegmark et al. 2000). 

Assuming that the sources with flux S > Smax can 
be subtracted from the data, the brightness temperature is 
given, for each pixel p, by the sum of all the point sources 
contained on it. 


Tp^iwp) (^y^ . (46) 

^ 1 = 1 

where S* is the flux of the point source i at 1.4 GHz, N 
is the number of point sources in pixel p, flsky = ^fwhm 
with 6 'fwhm being the full width at half minimum of the 
beam of the map at frequency u, and dB/dT — 2fc_B/A^ 
is the conversion factor between intensity and brightness 
temperature, with ks being the Boltzmann constant and A 
the wavelength of the observed radiation. Eor our simulation, 
we assume Smax = 100 mjy. 
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-0.23 


0.30 mK 


Figure 4. Maps covering a 50° X 50° patch of the sky with three different signals: (a) input HI, (b) GNILC reconstructed HI, and (c) 
residuals. The maps are centred at Galactic coordinates (120°, —60°) and observed at 1117.5 MHz (redshift z ~ 0.3). 


4 RESULTS 
4.1 Basic results 

The angular power spectra of the simulated astrophysi- 
cal foregrounds, HI emission, and thermal noise are shown 
in Fig. 3. These power spectra are computed with the 
HEALPix routine anafast (Gorski et al. 2005) on the sky 
covered with the apodized Galactic mask of Fig. 1. We see 
that the synchrotron radiation is dominant on all scales and 
has more power on large angular scales {i < 50). The extra- 
galactic point sources have the opposite behavior, with its 
main contribution coming from high multipoles £ > 100. The 
free-free radiation, on the other hand, has an almost con¬ 
stant power spectrum over the range of scales (0 < ^ < 250) 
and the fraction of the sky considered. Finally, the thermal 
noise and the HI emission have more power on small scales 
{£ > 150). 

We now present the results of the application of the 
GNILC method to separate the HI signal from the astrophys- 
ical and instrumental contamination of a general HI inten¬ 
sity mapping experiment. Figure 4 shows, for simulation 1 
(synchrotron with constant spectral index, HI, and thermal 
noise only) with 20 frequency channels, the input HI signal, 
the reconstructed HI signal and the residual contamination 
(recovered HI signal minus input HI signal) at 1117.5 MHz. 
These maps are shown with an angular resolution of 40 ar- 
cmin. We see that, after cleaning foregrounds and thermal 
noise of the observed signal, the GNILC method is able to re¬ 
construct the temperature fluctuations of the HI signal with 
a r.m.s residual equals to 0.04 mK at this 50° x 50° patch of 
the sky. There remains some low-level large-scale residuals, 
which are also visibile in the power spectra on the largest 
angular scales {£ < 30). 

To have a quantitative measure of the ability of the 
GNILC method to recover the HI signal, we plot, in Fig. 5, 
the power spectra of the input HI signal, the reconstructed 
HI signal, and the residual map on a linear scale. 

Although we have one reconstructed HI map for each 
frequency, to summarize our results, we show the power 


spectrum for the frequency that is closest to the middle point 
of the band. Given that the band extends from 960 MHz to 
1260 MHz, for the case of 6, 10, and 20 frequency channels, 
the central frequency channel is equal to 1135 MHz, 1125 
MHz, and 1117.5 MHz, respectively. 

Figure 5 shows that the GNILC method reconstructs, 
without significant bias, the input HI power spectrum for 
30 < ^ < 150. In the case of 20 frequency channels, the 
residual map power spectrum is around 12.5% of the input 
HI power spectrum for a range of multipoles that goes from 
30 to 150. Also, for the three different numbers of frequency 
channels, the local features of the HI power spectrum are 
reconstructed with good accuracy. 

When we increase the number of frequency channels, we 
see that the GNILC method performance improves. This im¬ 
provement with the number of frequency channels is clearer 
when we plot the normalized difference between the input 
power spectrum, Cf, and the reconstructed power spectrum, 
C^, of the HI signal. 


ACt 


/^S /^R 

Cf 


(47) 


This plot is given in Fig. 6. We see that over a mul¬ 
tipole range 30 < £ < 120, the HI power spectrum is bet¬ 
ter reconstructed by the GNILC method with 10 and 20 fre¬ 
quency channels than with 6 channels. In this range of mul¬ 
tipoles, the average normalized absolute difference between 
the input power spectrum and the reconstructed power spec¬ 
trum of the HI signal is 10.9%, 6.0%, and 4.9% for 6, 10, 
and 20 frequency channels, respectively. The improvement 
of the GNILC method with the increase of the number of fre¬ 
quency channels can be justified with the help of Eq. (14). 
With more frequency channels, there is more freedom for 
the GNILC method to fit for the independent components 
of emission because now the effective number of indepen¬ 
dent degrees of freedom, m, required to properly describe 
the foregrounds plus noise signal is no longer limited by 
a small number of observations. Therefore, with more fre- 
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6 channels, 1135.0 MHz 


10 channels, 1125.0 MHz 


20 channels, 1117.5 MHz 





Multipole I Multipole I Multipole / 


Figure 5. Results on simulation 1 (HI, synchrotron with constant spectral index, and thermal noise) with 6, 10, and 20 frequency 
channels: power spectra, + 1)Ci/2tt, of the simulated HI signal (black), the recovered HI signal (green), and the residual signal (red) 
at frequency 1135.0 MHz, 1125.0 MHz, and 1117.5 MHz, respectively. 



Multipole I 

Figure 6. Results on simulation 1 (HI, synchrotron with con¬ 
stant spectral index, and thermal noise): normalized difference 
between the input power spectrum, Cf, and the reconstructed 
power spectrum, C^, of the HI signal at frequency 1135.0 MHz, 
1125.0 MHz, and 1117.5 MHz with 6 (black), 10 (green), and 20 
(blue) frequency channels, respectively. 


quency channels, we are able to remove foregrounds from 
the observed data more efficiently. 

For multipoles greater than 120, the reconstruction with 
20 frequency channels has a greater dispersion (in this case, 
the average normalized absolute difference between the in¬ 
put power spectrum and the reconstructed power spectrum 
of the HI signal is equal to 8.9%) than the reconstruction 
with 10 frequency channels (average normalized absolute dif¬ 
ference is equal to 4.2%). This happens because the experi¬ 
ment with 20 frequency channels has a larger level of instru¬ 
mental noise in the observed data than the experiment with 
10 frequency channels (see Eq. (40)). Thus a direct com¬ 
parison between the two experiments is irrelevant on small 
scales. 

For simulation 2, we add radio point sources and free- 
free radiation to simulation 1. This increases the complexity 
of the observed sky. In this case, the foregrounds plus noise 
signal has more degrees of freedom than in the case of simu¬ 
lation 1. Figure 7 shows the power spectra of the recovered 
HI signal, the input HI signal, and the residual signal for 
simulation 2 with 6, 10, and 20 frequency channels. 

As for the case in simulation 1, when we increase the 
number of frequency channels for simulation 2, the GNILC 
method performance improves, now with a more noticeable 
effect. For a range of multipoles that goes from 30 to 150, the 
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6 channels, 1135.0 MHz 


10 channels, 1125.0 MHz 


20 channels, 1117.5 MHz 




Multipole l Multipole I 



Multipole I 


Figure 7. Results on simulation 2 (HI, synchrotron with constant spectral index, point sources, free-free, and thermal noise) with 6, 
10, and 20 frequency channels: power spectra, (-{(- + l)Cil2-K, of the simulated HI signal (black), the recovered HI signal (green), and 
residual signal (red) at frequency 1135.0 MHz, 1125.0 MHz, and 1117.5 MHz, respectively. 



Multipole I 

Figure 8. Results on simulation 2 (HI, synchrotron with constant 
spectral index, point sources, free-free, and thermal noise): nor¬ 
malized difference between the input power spectrum, Cf, and 
the reconstructed power spectrum, C^, of the HI signal at fre¬ 
quency 1135.0 MHz, 1125.0 MHz, and 1117.5 MHz with 6 (black), 
10 (green), and 20 (blue) frequency channels, respectively. 


residual map power spectrum is around 34.7%, 24.3%, and 
18.7% of the input HI power spectrum for the case with 6, 10, 
and 20 frequency channels, respectively. This improvement 
of the GNILC method with the increase of the number of 
frequency channels can again be justified with the help of Eq. 
(14). As the presence of point sources and free-free increases 
the number of degrees of freedom of the foregrounds plus 
noise signal, we need more frequency channels, rich, to be 
able to remove efficiently these extra degrees of freedom from 
the observed signal. 

Figure 8 shows the normalized difference between the 
input power spectrum, Cf, and the reconstructed power 
spectrum, Cf, of the HI signal for simulation 2. For the 
case of 20 frequency channels, we obtain an average normal¬ 
ized absolute difference around 6.3% for multipoles in the 
range between 30 and 150. 

In simulation 3, we consider the observed sky of simu¬ 
lation 2 but make the synchrotron radiation spectral index 
to be spatially variable. The results for this simulation are 
consistent with those of simulation 2. For the case of 20 fre¬ 
quency channels, we obtain an average normalized absolute 
difference between the input power spectrum, Cf, and the 
reconstructed power spectrum, Cf^, of the HI signal that is 
around 6.1% for multipoles in the range between 30 and 150. 
Figure 9 shows the power spectra of the recovered HI signal. 
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20 channels, 1117.5 MHz 



Figure 9. Result on simulation 3 (HI, synchrotron with spa¬ 
tially variable spectral index, point sources, free-free, and thermal 
noise): power spectra, £{£ + l)Ci/2iT, of the simulated HI signal 
(black), the recovered HI signal (green), and the residual signal 
(red) at frequency 1117.5 MHz with 20 frequency channels. 

the input HI signal, and the residual map for simulation 3 
with 20 frequency channels. 

In Table 3, we summarize the values of the average nor¬ 
malized absolute difference between the input power spec¬ 
trum, Cf, and the reconstructed power spectrum, C/*, of 
the HI signal for our simulations. 

In the case where the observed sky of a real experiment 
is more complex than the sky of simulation 3, the number 
of degrees of freedom required to describe the foregrounds 
plus noise signal may be larger. Consequently, to reconstruct 
accurately the HI power spectrum, the GNILC method may 
require more than 20 frequency channels. As HI intensity 
mapping experiments such as BINGO can have ~ 1000 or 
more channels, the increase, for a real experiment, in the 
number of degrees of freedom of the foregrounds plus noise 
signal will not be a limitation for the GNILC method. 


4.2 Instrumental effects 

The thermal noise amplitude, for 20 frequency channels, 
at = 0.05 mK, chosen for our simulations is of the cor¬ 
rect order of magnitude for a single-dish experiment such as 
BINGO (Bigot-Sazy et al. 2015). To see the effect of a larger 
amplitude for the thermal noise, we simulate again the ob¬ 
served sky of simulation 1 (HI, synchrotron with constant 
spectral index, and thermal noise) but now with a thermal 
noise amplitude at = 0.08 mK. 

In Fig. 10, we show the power spectra of the recovered 
HI signal, the input HI signal, and the residual map for 
different noise levels at 1117.5 MHz. In this figure, we have 
the results for simulation 1 with no thermal noise and with 
thermal noise amplitudes at equal to 0.05 and 0.08 mK. 

We see that some residual thermal noise signal contam¬ 
inates the recovered HI signal. This effect is significant at 
small scales (£ > 80), where the recovered HI signal is en¬ 
hanced, compared to the case without thermal noise, in the 


Table 3. The average absolute difference between the input and 
the GNILC reconstructed HI power spectrum normalized by the 
input HI power spectrum for multipoles in the range between 30 
and 150. 


Number of channels 

Simulation 1 

Simulation 2 

Simulation 3 

6 

0.11 

0.30 

0.27 

10 

0.06 

0.15 

0.14 

20 

0.06 

0.06 

0.06 


cases with thermal noise amplitudes equal to 0.05 and 0.08 
mK. As expected, the larger the thermal noise amplitude 
is, the less accurate is the reconstruction of the HI power 
spectrum on small angular scales. 

However, in a real experiment, if we are able to have 
a good estimate of the thermal noise power spectrum, then 
the GNILC method is able to recover the HI plus the thermal 
noise signal from the observed data. To do this, instead of 
using the HI covariance matrix, Rhi, in the transformation 
of Eq. (10), we can use Rhi + Rnoise as a prior. In this case, 
we are able to explore the HI signal plus noise subspace and 
therefore to recover the HI signal plus thermal noise. In this 
scenario, to minimize the noise bias in the reconstructed 
power spectrum, we can correct for the thermal noise af¬ 
terwards by using the estimate of the instrumental noise 
power spectrum in each frequency channel and then obtain 
a reconstructed HI power spectrum with minimal noise con¬ 
tamination. 

We also consider the possibility that the telescope cre¬ 
ates systematic errors that result in non-ideal calibration of 
the data. To simulate this in a simplihed manner, we added 
a small source of fluctuations in the spectral response. These 
fluctuations might represent standing waves in the telescope 
structure or frequency variations in the receiver gain temper¬ 
ature. We parametrize this fluctuations with an additional 
oscillatory term to the measured brightness temperature as 
a function of frequency, 

Ar,(iz)=Asin(^^), (48) 

where A = 100 mK, v is the frequency of observation and 
Auosc = 10, 100, and 300 MHz. When AiZosc is larger than 
the frequency bin of the experiment (15 MHz for 20 fre¬ 
quency channels in our experiment), we have a curvature 
in the frequency spectrum, which is similar to the effect of 
standing waves in a real experiment. On the other hand, 
when Avosc is smaller than the frequency channel width, it 
adds a noise-like term, similar to the HI signal itself. This 
could represent random bandpass calibration errors, for ex¬ 
ample. 

For a range of multipoles that goes from 30 to 150, 
the average absolute difference between the input and the 
reconstructed HI power spectrum normalized by the input 
HI power spectrum is 5.95%, 5.99%, and 5.93% for the case 
with AiZosc equals to 10, 100, and 300 MHz, respectively. In 
Fig. 11, we show the result for Afoac = 10 MHz. We see 
that the GNILC method is robust to the possibility of the 
telescope having a non-smooth response to the synchrotron 
radiation frequency spectrum. This shows that the GNILC 
method is able to detect the extra degrees of freedom in the 
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Figure 10. Impact of thermal noise: power spectra, £{£ + l)C^/27r, of the simulated HI signal (black), the recovered HI signal (green), 
and the residual signal (red) for simulation 1 (HI, synchrotron with constant spectral index, and thermal noise) at frequency 1117.5 MHz 
for different thermal noise amplitudes. 


foregrounds plus noise subspace given by the parameters A 
and Ariose and properly describe this subspace. In general, as 
long as the angular power spectrum of the systematic effects 
behaves differently from the angular power spectrum of the 
HI emission, the GNILC method is able to detect the degrees 
of freedom related to the different systematic effects that are 
usually present in an HI intensity mapping experiment. 


4.3 Prior effects 

The GNILC method makes use of a prior of the theoreti¬ 
cal HI power spectrum to estimate the dimension of the HI 
subspace. In this section, we study how the GNILC method 
performs with different priors for the HI power spectrum. 
First, we use a prior for the HI power spectrum that does 
not contain any fluctuation, i.e a smooth HI power spec¬ 
trum, but that has the correct average amplitude for the HI 
signal. We also study the effect of having a smooth prior that 
over or underestimate the HI signal. Moreover, we neglect 
in the prior any possible cross-correlations between different 
frequency channels (cross power spectra). 

Three examples of these incorrect priors for the HI 
power spectrum are shown in Fig. 12. In these simulations, 
we consider the observed sky of simulation 1 and 20 fre¬ 
quency channels. Figure 12 shows, for each prior considered. 


Av„3, = 10MHz 



Figure 11. Impact of non-smooth response on frequency: power 
spectra, £{£ + l)C^/27r, of the simulated HI signal (black), the re¬ 
covered HI signal (green), and the residual signal (red) for the 
simulation with non-smooth response of the telescope to the syn¬ 
chrotron radiation with Ai/oac = 10 MHz. We use the observed 
sky of simulation 1 (HI, synchrotron with constant spectral index, 
and thermal noise). 
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Multipole / Multipole I Multipole / 

Figure 12. Sensitivity to incorrect priors: HI prior (blue), input HI signal (black), recovered HI signal (green), and residual signal (red) at 
frequency 1117.5 MHz for the simulation 1 (HI, synchrotron with constant spectral index, and thermal noise) with 20 frequency channels 
when considering different priors for the HI power spectrum. The prior in the first panel is given by a smooth HI power spectrum with 
the correct average amplitude for the HI signal, the prior in the second panel is given by a smooth spectrum that overestimates the HI 
signal by a factor of 2.5, and the prior in the third panel is given by a smooth spectrum that underestimates the HI signal by a factor 
of 2. 


the input HI power spectrum, the prior HI power spectrum, 
the reconstructed HI power spectrum, and the power spec¬ 
trum of the residual signal. 

We see that the GNILC method is not critically sensitive 
to the absence of local features in the prior HI power spec¬ 
trum. The GNILC reconstruction, however, depends on the 
overall amplitude of the HI power spectrum. This depen¬ 
dency on the amplitude of the HI signal happens because 
the GNILC method uses the relative power of the HI signal 
compared to the observed signal in determining the dimen¬ 
sion of the HI subspace. 

When our prior underestimates the HI power spectrum 
by a factor of 2, we cannot properly recover the HI signal. On 
the other hand, when the prior overestimates the HI power 
spectrum by a factor of 2.5, we are still able to recover the HI 
power spectrum with good accuracy. This difference of be¬ 
havior is due to the fact that the amplitude of the HI signal 
is smaller than the Galactic foreground radiation by many 
orders of magnitude. Therefore, when the prior is overesti¬ 
mating the HI signal by a factor of 2.5, the amplitude of the 
prior power spectrum remains much smaller than the fore¬ 
ground radiation. Consequently, the constrained PCA step in 
GNILC algorithm estimates the same dimension for the HI 


subspace as the case of a prior with the correct amplitude. 
The same does not happen when the prior underestimates 
the HI signal. As now the amplitude of the HI prior is com¬ 
parable to the noise, the constrained PCA analysis needs to 
add an additional dimension to the foregrounds plus noise 
subspace at the cost of loosing part of the HI signal in this 
dimension. In this case, the GNILC method cannot recover 
the HI signal with acceptable accuracy. 

There is, however, a limit to the overestimation of the 
HI signal that we can adopt in the prior. The reason for this 
is that, in our representation of the observed data, Eq. (14), 
the HI subspace is characterized by the eigenvalues that are 
close to 1. A significant overestimation of the prior on the 
HI power spectrum makes the eigenvalues of the observation 
covariance matrix that are due to foregrounds plus noise to 
be of the same order of magnitude than those that are due 
to HI signal. In this case, the reconstructed HI power spec¬ 
trum suffers from strong foregrounds contamination because 
the GNILC method is not able to properly separate the HI 
subspace from the foregrounds plus noise subspace. 

By studying different smooth priors for the HI power 
spectrum, we can dehne the range of uncertainty on the 
prior of the HI power spectrum that is acceptable to accu- 
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Figure 13. Comparison of GNILC with standard PCA: power spectra, t(t + l)C^/ 27 r, of the input HI signal, the GNILC recovered HI 
signal, and the PCA recovered HI signal with different number of principal components. 


rately recover the HI emission with GNILC. We find that we 
can overestimate the HI power spectrum by a factor of 25 
or underestimate it by a factor of 1.5 and the GNILC method 
is still able to reconstruct the HI power spectrum with good 
accuracy (average absolute difference between the input and 
the reconstructed HI power spectrum normalized by the in¬ 
put HI power spectrum smaller than 10%). 

The GNILC method can in principle be applied to a CO 
intensity mapping experiment as long as the amplitude of 
the cosmological CO power spectrum can be known with an 
uncertainty of no more than 50%. The current constraints on 
the amplitude of the CO signal are still not accurate enough 
to provide a reliable prior for the CO power spectrum that 
is needed in the GNILC method; Li et al. (2015) compared 
the CO power spectra of different models available in the 
literature and showed that they can vary by 2 orders of 
magnitude. 

4.4 Comparison of GNILC with standard PCA 

We now compare the performance of the GNILC method with 
the standard PCA method. The standard PCA method, to¬ 
gether with the use of a transfer function to correct for HI 
signal loss, has been used for real data (Masui et al. 2013; 
Switzer et al. 2013) and is commonly used in HI intensity 


mapping simulations (Alonso et al. 2015; Bigot-Sazy et al. 
2015). For this comparison, we use simulation 3, which con¬ 
sists of synchrotron radiation with spatially variable spectral 
index, point sources, free-free radiation, and thermal noise. 

In the standard PCA, the principal components (fore¬ 
grounds) are determined by looking at the largest eigenval¬ 
ues of the observation covariance matrix and isolating the 
corresponding eigenvectors. With no prior assumption on 
the HI power spectrum, the optimal number of principal 
components detected by PCA analysis is generally given by 
a constant on the whole range of angular scales considered. 
The PCA thus implicitly assumes that the ratio between the 
HI power spectrum and the foreground power spectrum is 
uniform over all the angular scales, which is not the case. 
Therefore fixing the number of principal components as a 
constant on all the scales is equivalent to making a wrong 
assumption on the shape of the HI power spectrum. 

In our simulation 3, the standard PCA recovers the HI 
signal with smallest bias when we choose the number of 
principal components to be equal to 3 (Fig. 13). We note that 
the number of detected principal components depends on the 
complexity of the foregrounds and noise being simulated. 

The main advantage of the GNILC method is that, unlike 
the PCA, the number of principal components is estimated 
locally and therefore varies with angular scale and location 
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Table 4. The average absolute difference between the input and 
the reconstructed HI power spectrum normalized by the input 
HI power spectrum for the GNILC method and the PCA with 3 
principal components. 


Range of multipoles 

GNILC 

PCA 

1 - 15 

0.93 

0.19 

15 - 30 

0.50 

0.12 

30 - 45 

0.09 

0.14 

45 - 60 

0.05 

0.15 

60 - 75 

0.06 

0.17 

75 - 90 

0.05 

0.22 

90 - 105 

0.04 

0.25 

105 - 120 

0.04 

0.27 

120 - 135 

0.06 

0.32 

135 - 150 

0.07 

0.38 


on the sky. The reconstructed HI power spectra for the PCA, 
with different (2, 3 and 9) principal components, and for the 
GNILC method are shown in Fig. 13. In Table 4, we compare, 
for different ranges of multipoles, the performance of the 
PCA with 3 principal components and of the GNILC method. 
For each range of multipoles, we calculate the average ab¬ 
solute difference between the input and the reconstructed 
HI power spectrum normalized by the input HI power spec¬ 
trum. We see that the PCA performance is worse than the 
GNILC performance by a factor of 5 on most angular scales 
{£ > 30). For very large angular scales, £ < 30, PCA looks 
to better match the input HI power spectrum than GNILC. 
This, however, only happens because the PCA gives an ar¬ 
bitrary number of dimensions to the HI subspace on these 
scales that is sufficient to reconstruct the HI power spec¬ 
trum with good accuracy. Note also that, as a consequence 
of its strictly blind analysis, the PCA cannot have a strong 
confidence in the reconstructed power on these scales. 

Depending on the number of principal components that 
are removed, the PCA either underestimate the HI power 
spectrum or is strongly contaminated by residual fore¬ 
grounds. It is never able to accurately measure the HI power 
spectrum over the whole range of scales. This is particularly 
true in our simulation where a large area of the sky is ob¬ 
served. For being large, our observed sky includes significant 
variations of the foregrounds with respect to the HI signal 
over the sky and over angular scales. The reason for the 
GNILC method to reconstruct more accurately the HI power 
spectrum than the PCA method for most of the angular scales 
is exactly what differentiates them: the number of principal 
components is locally determined, driven by the local signal 
to noise ratio, in harmonic space by the GNILC method, while 
this number is fixed in all angular scales in the PCA method. 
Therefore, the GNILC method is able to consider the varia¬ 
tions with angular scale of the foregrounds plus noise signal 
in reconstructing the HI signal, while the PCA method is not. 


5 CONCLUSIONS 

In this work, we have introduced a new component separa¬ 
tion technique for an HI intensity mapping experiment: the 
Generalized Needlet Internal Linear Combination (GNILC) 
method. As the GNILC method works in a wavelet space, it 
uses angular and spatial information to recover the HI signal 
from the observed data. Also, as the GNILC method is able to 
explore the HI signal subspace of the observation covariance 
matrix, it allows us to recover the HI signal (not HI signal 
plus thermal noise) from the observed data. 

To test the GNILC method in a diverse set of experi¬ 
mental conditions, we performed several simulations for a 
general HI intensity mapping experiment in low redshifts. 
For the astrophysical foregrounds, we simulated synchrotron 
radiation, extragalactic point sources and free-free radia¬ 
tion. For the instrumental noise, we considered only thermal 
noise. We also studied the possibility that the experimental 
setup creates systematic errors in the relation between the 
synchrotron brightness temperature and frequency, making 
this relation to non-longer be a smooth function. 

For the set of simulations of an HI intensity mapping 
experiment in low redshifts that we have considered, we were 
able to show that the GNILC method is robust to the com¬ 
plexity of the foregrounds. We have found that we can re¬ 
cover the cosmological HI power spectrum for multipoles 
£ > 30 with very good accuracy. As the number of frequency 
channels of the experiment increases, the GNILC method im¬ 
proves in its ability to separate the HI signal from the astro- 
physical foregrounds and the instrumental noise. This is due 
to the fact that the GNILC method is able to adapt to the 
number of degrees of freedom of the foregrounds plus noise 
signal: when the number of frequency channels increases, the 
GNILC method is able to use, if necessary, more degrees of 
freedom to describe the foregrounds plus noise signal, result¬ 
ing in a better reconstruction of the HI signal. 

To estimate locally on the sky and on the angular scales 
the dimensions of the foregrounds plus noise and HI sub¬ 
spaces, the GNILC method uses a prior for the HI power 
spectrum. We have considered different incorrect priors for 
the HI power spectrum and studied their effect on the abil¬ 
ity of GNILC to recover the HI signal. Our results show that, 
even if we use a prior for the HI angular power spectrum to 
estimate the HI covariance matrix that does not have any 
fluctuation, the GNILC method is still able to reconstruct the 
local features of the HI power spectrum. The GNILC method, 
however, depends on the amplitude of the HI power spec¬ 
trum. This is due to the fact that the GNILC method uses the 
relative power of the HI signal compared to the observations 
to determine the dimension of the HI signal subspace. 

We compared the GNILC method with the PCA method. 
For all multipoles greater than 30, we recovered a more ac¬ 
curate HI power spectrum with GNILC than with PCA. 

Our results show that the GNILC method is robust to 
the complexity of the different astrophysical foregrounds. 
The GNILC method is, therefore, a promising non-parametric 
component separation method for HI intensity mapping ex¬ 
periments. Though we have considered low redshifts for 
our simulations, this method can in principle be applied to 
high redshifts and therefore be a useful foreground cleaning 
method for experiments probing the Epoch of Reionization. 
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APPENDIX A: NEEDLETS 

Wavelets are a collection of functions with properties close to 
those of a basis. The main difference between wavelets and 
a vector basis is that wavelets are redundant (Daubechies 
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1992). Needlets were introduced as a particular construc¬ 
tion of a wavelet family on the sphere by Narcowich et al. 
(2006) and Guilloux et al. (2009). Needlets can be inter¬ 
preted as a set of band-pass filters in harmonic space. The 
most important property of the needlets is that they have 
excellent localisation in the spherical harmonic domain. 

We define the set of needlets windows such that, over 
the useful range of multipole £, we have 

= 1, (Al) 

j 

where j corresponds to a specific range of multipoles and 
characterizes a particular needlet window. Maps of needlet 
coefficients are obtained, for each observed map x{p), by 
inverse spherical harmonic transform (SHT) of the harmonic 
coefficients xim of the observed map filtered by the needlet 
passband tip, 


t 

P'^\P)=^ ^ Xlrr,hPYlm{p), (A2) 

^ m=—£ 

where Yim are the spherical harmonics. 

For each range of multipoles j and for each pixel p of 
the corresponding needlet coefficient maps xP (one map for 
each frequency a), we can compute the covariance matrix 
R from an average of the product of needlet coefficients. 
This average is done in a domain Dp centred at pixel p and 
including some neighbouring pixels. This local average in 
pixel space, together with the property of the needlets of 
being almost local in spherical harmonic space, allows us to 
be approximately local in space and angular scale. The ab 
component of the covariance matrix is then 


f^PiP) = ^ E xP{p')xPiPp (A3) 

P'6®, 


The domain Dp is defined by convolving the product of maps 
with a symmetric Gaussian window in pixel space. 

For each needlet scale j, the number of pixels nP, 
which determines the domain Dp used in Eq. (A3), can be 
constrained in the following way. When performing compo¬ 
nent separation, we may encounter the problem of ILC bias 
discussed in Delabrouille et al. (2009). On small domains of 
the sky the statistics are computed on a reduced number 
Np of modes so that it can creates artificial anti-correlations 
between the component of interest and the contaminants. 
This unphysical correlations may induce a power loss in the 
HI signal reconstruction. The loss of the HI power is quanti¬ 
fied by the multiplicative ILC bias b, derived in Delabrouille 
et al. (2009), 


6 = 


^ch 


N„ 


1 


(A4) 


where rich is the number of frequency channels and Nm is 
the number of modes in the domain considered. Therefore, 
to control the ILC bias, there is a minimum size for the set 
of data points on which the ILC should be implemented. We 
can thus constrain the number of pixels nP in the domain 
Dp used to compute the local covariance in such a way to 
maintain the ILC bias b fixed for each needlet scale j. 


APPENDIX B: DERIVATION OF THE AKAIKE 
INFORMATION CRITERION 

For a given dimension m of the foregrounds plus noise sub¬ 
space, we assume that the data x are described by a Gaus¬ 
sian distribution with covariance matrix R(m) = RHi(m) -I- 
Rn(rri). Therefore the likelihood distribution of the data is 


/:(a;|R(m)) = 


\/27r det R(m) 


exp 


1 Ta~lf \ 

--XkR (m)xk 


(Bl) 


where n is the number of modes in the needlet domain con¬ 
sidered. The log-likelihood can be rewritten as 


—21ogT = a;^ R ^(m) aj^, — logdet R ^ (m)-I-constant 

k = l 

= nK ^R, R(m)^ -I- constant, (B2) 

where K ^R, R(m)^ is the Kullback-Leibler divergence 
(Kullback & Leibler 1951), which measures the mismatch 
between the model covariance matrix R(m) and the data 
covariance matrix R, 


K (^R, R(m)^ = Tr (^RR~^(m)^ - log det (^RR~^(m)) - rich. 

(B3) 

For a given dimension m of the foregrounds plus noise 
subspace, the covariance matrix R(m) that minimizes the 
Kullback-Leibler divergence (or maximize the likelihood), 
Eq. (B3), is given by the sum of Eq. (18) and Eq. (19). 

For each location on the sky and angular scale con¬ 
sidered, we can then select the best rank value mt for the 
foregrounds plus noise covariance matrix by minimizing the 
Akaike Information Criterion (AIC), 


AlC(m) = 2nm — 21og(£max(rri)). (B4) 


Since the transformed data covariance matrix can be diago- 

-l/2^^-l/2 J, 

nalized as Rhi RRhi — UDU , where, as in Eq. (14), 


U = [Uv Us] and D = 


and the maximum likelihood happens at 
Uiv(Div — l)U5i + I, we have that 


0 

Ds 


(B5) 


^- 1/2 ^- 1/2 
Rhi RRhi 
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— nK ( Rjji RRhi 1 UAr(D]v — l)Ujv + I 


-21og(£inax(m)) = 

( 

= nK (^UDU^, U]v(Djv - l)U^ + l) 

^D,U^ [U]v(Djv - l)U5^ + l] U 


= nK 

= nK 

= n ( Tr 



D]v 0 

! 

0 1 


Djv 0 
0 Ds 

I 0 

0 Ds 

"ch 

+ m- log Pi - rich 


— log det 

"ch 


I 0 

0 Ds 


i = m+l 


i = m+l 


= ”• - 1], 

i = m+l 



(B6) 


where pi are the eigenvalues of the transformed covariance 
matrix of the observed data. Substituting Eq. (B6) in the 
expression for the AIC, Eq. (B4), we obtain 


AlC(m) = n 


2m + 


"ch 

[Pi - log Pi 


2 = m+l 



(B7) 


The dimension of the foregrounds plus noise subspace m 
is then estimated by minimizing Eq. (B7) in each region 
considered. 
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